Exploring The Use Of Hybrid Similarity Measure For Author Name Disambiguation

Author

Tasleem Arif

Abstract

Name disambiguation has become one of the hard-to-crack problems in virtual environments. With each passing day, more and more entities with identical features emerge online, making it difficult to distinguish them. Digital libraries face a similar problem in differentiating the publications of authors with similar-looking names. This leads to incorrect attribution of publications, which undermines the entire effort of indexing publications by individual author. This paper proposes a two-stage hybrid similarity computation mechanism that combines the strengths of token-based and character-based measures. The proposed method uses a token-based similarity score in the first stage of comparison and, based on the results of that stage, a character-based similarity score in the second stage. Experimental results on standard datasets indicate that the proposed technique shows substantial improvement over existing methods.
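The two-stage idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say which token-based or character-based measures the paper uses, nor how the first stage decides whether the second is needed, so everything below is an assumption made for illustration: Jaccard overlap of name tokens stands in for the token-based score, Python's `difflib.SequenceMatcher` ratio stands in for the character-based score, and the function names and the thresholds `accept`, `reject`, and `char_threshold` are hypothetical.

```python
from difflib import SequenceMatcher


def token_jaccard(a: str, b: str) -> float:
    """Token-based score (assumed): Jaccard overlap of lowercase name tokens."""
    ta = set(a.lower().replace(".", " ").split())
    tb = set(b.lower().replace(".", " ").split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def char_ratio(a: str, b: str) -> float:
    """Character-based score (assumed): difflib matching ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def hybrid_match(a: str, b: str, accept: float = 0.8, reject: float = 0.2,
                 char_threshold: float = 0.85) -> bool:
    """Two-stage comparison; all thresholds are hypothetical placeholders."""
    # Stage 1: token-based score settles the clear cases.
    token_score = token_jaccard(a, b)
    if token_score >= accept:
        return True
    if token_score <= reject:
        return False
    # Stage 2: only ambiguous pairs get the character-based comparison.
    return char_ratio(a, b) >= char_threshold


if __name__ == "__main__":
    print(hybrid_match("Tasleem Arif", "Arif Tasleem"))   # reordered tokens
    print(hybrid_match("Tasleem Arif", "Tasleem Arief"))  # spelling variant
    print(hybrid_match("Tasleem Arif", "John Smith"))     # unrelated name
```

In this sketch the token stage handles reordered name parts cheaply, while the character stage catches spelling variants that share only some tokens. Whether the paper routes pairs between the stages in this way is not stated in the abstract.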


Similar resources

بهبود صحت ابهام‌زدایی نام نویسنده با استفاده از خوشه‌بندی تجمّعی [Improving the accuracy of author name disambiguation using cumulative clustering]

Today, digital libraries are important academic resources that hold millions of citations and essential bibliographic information such as titles, authors' names, and location of publications. From the viewpoint of knowledge accumulation and management, the ability to search quickly and accurately for the desired content is of great importance. The complexity and similarity in these resources cause many challenges and...


Author Name Disambiguation Using a New Categorical Distribution Similarity

Author name ambiguity has been a long-standing problem which impairs the accuracy of publication retrieval and bibliometric methods. Most of the existing disambiguation methods are built on similarity measures, e.g., “Jaccard Coefficient”, between two sets of papers to be disambiguated, each set represented by a set of categorical features, e.g., coauthors and published venues. Such measures pe...


A Template Based Hybrid Model for Chinese Personal Name Disambiguation

This paper proposes a template based hybrid model for Chinese Personal Name Disambiguation (CPND). The template makes use of the features of personal role such as discriminating personal name (nickname, stage name), together with the specific context of most frequent words, personal name nearest words named entities, date and time that are effective for this disambiguation task, as well as surr...


Merging error analysis of name disambiguation based on author similarity

Falsely identifying different authors as one is called merging error in the name disambiguation of coauthorship networks. Research on the measurement and distribution of merging errors helps to collect high quality coauthorship networks. In the aspect of measurement, we provide a Bayesian model to measure the errors through author similarity. We illustratively use the model and coauthor similar...


Scaling Author Name Disambiguation with CNF Blocking

An author name disambiguation (AND) algorithm identifies a unique author entity record from all similar or same publication records in scholarly or similar databases. Typically, a clustering method is used that requires calculation of similarities between each possible record pair. However, the total number of pairs grows quadratically with the size of the author database making such clustering...


Publication date: 2015
